Distributed checkpointing based on influential messages
Authors
Abstract
In distributed applications, a group of objects cooperates to achieve some objective. Computation on the objects is based on message passing, i.e., remote procedure calls. The objects may suffer from various kinds of faults. In the presence of object faults, the states of the objects in the system have to be kept consistent. If an object o is faulty, o is rolled back to its checkpoint, and objects that have received messages from o are also required to be rolled back. In this paper, we define influential messages: messages whose receivers are required, from the application point of view and on the basis of the message semantics, to be rolled back if the senders are rolled back. Using influential messages, we define a significant checkpoint, which denotes a consistent global state of the system even though it might be inconsistent under the traditional definition. We present protocols for taking significant checkpoints and for rolling back objects by using influential messages.
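The rollback rule described in the abstract can be illustrated with a small sketch. This is not the paper's protocol, which the abstract does not spell out; it is a minimal illustration under stated assumptions. The names `Message`, `influential`, and `objects_to_roll_back` are hypothetical, the influential flag is assumed to be supplied by the application, and every message is assumed to have been sent after the sender's latest checkpoint.

```python
from collections import deque
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    influential: bool  # hypothetical flag, assumed to be set by the application


def objects_to_roll_back(faulty, messages, only_influential=True):
    """Return the set of objects that roll back when `faulty` rolls back.

    Assumption: every message in `messages` was sent after the sender's
    latest checkpoint, so rolling the sender back undoes the send.
    """
    rolled_back = {faulty}
    queue = deque([faulty])
    while queue:
        source = queue.popleft()
        for msg in messages:
            if msg.sender != source:
                continue
            # Skip messages that do not force the receiver to roll back.
            if only_influential and not msg.influential:
                continue
            if msg.receiver not in rolled_back:
                rolled_back.add(msg.receiver)
                queue.append(msg.receiver)
    return rolled_back


messages = [
    Message("o1", "o2", True),
    Message("o1", "o3", False),
    Message("o2", "o4", True),
    Message("o3", "o5", True),
]

# Only receivers reachable through influential messages roll back.
print(sorted(objects_to_roll_back("o1", messages)))
# Traditional rule: every receiver of any message rolls back.
print(sorted(objects_to_roll_back("o1", messages, only_influential=False)))
```

In this example, following only influential messages rolls back o1, o2, and o4, whereas treating every message as a dependency rolls back all five objects. The smaller rollback set is the kind of saving the abstract attributes to influential messages.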
Similar resources
A New Checkpointing Approach for Mobile Distributed System
In this paper, we introduce a weighted checkpointing approach for the mobile distributed computing system (MDCS) that significantly reduces checkpointing overheads on mobile nodes. Checkpoint protocols proposed so far in the literature for MDCS are either coordinated, log based or quasi-synchronous. Coordinated checkpointing requires extra synchronization messages and may block the underlying c...
An Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment
Mobile computing systems are made up of different components, among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environments. In the suggested scheme, nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and, as a result, the workload of Mobile ...
Anti-message Logging based Check Pointing Algorithm for Mobile Distributed Systems
Checkpointing is one of the commonly used techniques to provide fault tolerance in distributed systems so that the system can operate even if one or more components have failed. However, mobile computing systems are constrained by low bandwidth, mobility, lack of stable storage, frequent disconnections and limited battery life. Hence checkpointing protocols which have fewer checkpoints are pref...
Rollback Recovery Scheme for Distributed Shared Memory Clusters
In this paper, a unified lightweight error recovery scheme based on coordinated checkpointing and rollback for distributed shared memory clusters is proposed. The new scheme maintains multiple globally consistent checkpoints of the state of a distributed shared memory cluster and recovers to a pre-fault checkpoint of the system. It also describes and evaluates the coordinated checkpointing. Th...
Anti-message Logging Based Coordinated Checkpointing Protocol for Deterministic Mobile Computing Systems
A checkpoint algorithm for mobile computing systems needs to handle many new issues, such as mobility, low bandwidth of wireless channels, lack of stable storage on mobile nodes, disconnections, limited battery power, and high failure rate of mobile nodes. These issues make traditional checkpointing techniques unsuitable for such environments. Minimum-process coordinated checkpointing is an attract...